WebContent: efficient P2P Warehousing of web data

Authors

  • Serge Abiteboul
  • Tristan Allard
  • Philippe Chatalic
  • Georges Gardarin
  • A. Ghitescu
  • François Goasdoué
  • Ioana Manolescu
  • Benjamin Nguyen
  • M. Ouazara
  • A. Somani
  • Nicolas Travers
  • Gabriel Vasile
  • Spyros Zoupanos
Abstract

We present the WebContent platform for managing distributed repositories of XML and semantic Web data. The platform allows integrating various data processing building blocks (crawling, translation, semantic annotation, full-text search, structured XML querying, and semantic querying), presented as Web services, into a large-scale efficient platform. Calls to various services are combined inside ActiveXML [8] documents, which are XML documents including service calls. An ActiveXML optimizer is used to: (i) efficiently distribute computations among sites; (ii) perform XQuery-specific optimizations by leveraging an algebraic XQuery optimizer; and (iii) given an XML query, choose among several distributed indices the most appropriate one to answer the query.


Similar resources

Warehousing Web Resources with the WebContent Platform

We describe the WebContent platform for the management of content from the Web. The platform is based on a service-oriented architecture and Web standards (notably, Web services, XML and RDF). An enterprise service bus (following the JBI specification) and BPEL may be used to orchestrate service invocations. A peer-to-peer architecture may also be used to facilitate cooperation between independe...

The WebContent XML Store

In this article, we describe the XML storage system used in the WebContent project. We begin by advocating the use of an XML database to store WebContent documents, and we present two different ways of storing and querying these documents: a centralized XML database and a P2P XML database.

A Data Warehousing and Data Mining Framework for Web Usage Management

A new challenge in Web usage analysis is how to manage and discover informative patterns from various types of Web data stored in structured or unstructured databases for system monitoring and decision making. In this paper, a novel integrated data warehousing and data mining framework for Website management and patterns discovery is introduced to analyze Web user behavior. The merit of the fra...

Proposed Quality Evaluation Framework to Incorporate Quality Aspects in Web Warehouse Creation

Web Warehouse is a read only repository maintained on the web to effectively handle the relevant data. Web warehouse is a system comprised of various subsystems and process. It supports the organizations in decision making. Quality of data store in web warehouse can affect the quality of decision made. For a valuable decision making it is required to consider the quality aspects in designing an...

A Novel Caching Strategy in Video-on-Demand (VoD) Peer-to-Peer (P2P) Networks Based on Complex Network Theory

The popularity of video-on-demand (VoD) streaming has grown dramatically over the World Wide Web. Most users in VoD P2P networks have to wait a long time in order to access their requesting videos. Therefore, reducing waiting time to access videos is the main challenge for VoD P2P networks. In this paper, we propose a novel algorithm for caching video based on peers' priority and video's popula...

Journal title:
  • PVLDB

Volume 1  Issue 

Pages  -

Publication date 2008